Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition

Neural Information Processing Systems

Offline handwriting recognition systems require cropped text line images for both training and recognition. On the one hand, line-level annotations of position and transcript are costly to obtain. On the other hand, automatic line segmentation algorithms are prone to errors, which compromise the subsequent recognition. In this paper, we propose a modification of the popular and efficient Multi-Dimensional Long Short-Term Memory Recurrent Neural Networks (MDLSTM-RNNs) to enable end-to-end processing of handwritten paragraphs. More specifically, we replace the collapse layer, which transforms the two-dimensional representation into a sequence of predictions, with a recurrent version that can select one line at a time. In the proposed model, a neural network performs a kind of implicit line segmentation by computing attention weights on the image representation. Experiments on paragraphs from the Rimes and IAM databases yield results that are competitive with those of networks trained at line level, and constitute a significant step towards end-to-end transcription of full documents.
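The difference between a standard collapse and an attention-weighted collapse can be illustrated with a small NumPy sketch. This is not the paper's implementation: the function names, the array layout (height, width, features), and the column-wise softmax normalization are assumptions made for illustration, and the attention scores are set by hand here, whereas in the proposed model they would be produced by a recurrent network at each step.

```python
import numpy as np


def standard_collapse(features):
    """Sum a (H, W, D) feature map over the vertical axis, giving (W, D)."""
    return features.sum(axis=0)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def weighted_collapse(features, scores):
    """Attention-weighted collapse (illustrative assumption, not the paper's code).

    features: (H, W, D) two-dimensional representation of the paragraph image.
    scores:   (H, W) unnormalized attention scores for the current step.
    Returns the (W, D) collapsed sequence and the (H, W) attention weights.
    """
    # Normalize over the vertical axis so each column's weights sum to 1.
    weights = softmax(scores, axis=0)
    collapsed = (weights[:, :, None] * features).sum(axis=0)
    return collapsed, weights


# Toy example: a "paragraph" with 3 rows, 4 columns and 2 features per position.
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4, 2))

# One collapse per step; hand-made scores that peak on a different row each
# time mimic the model attending to one text line after another.
line_sequences = []
for row in range(features.shape[0]):
    scores = np.full(features.shape[:2], -50.0)
    scores[row, :] = 50.0
    collapsed, _ = weighted_collapse(features, scores)
    line_sequences.append(collapsed)
```

With sharply peaked scores, each collapsed sequence is essentially one row of the feature map, which is how attention weights can act as an implicit line segmentation. The standard collapse, by contrast, mixes all rows into a single sequence.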


Reviews: Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition

Neural Information Processing Systems

Positives:
- The idea of applying this technique to generalize single-line handwriting recognition to multi-line input is both elegant and a natural next step. It is unsurprising that this is one of at least two works (mentioned by the authors) attempting similar techniques (not that this is relevant, but I consider this to be the more mature and developed work of the two). The colorized attention figure is particularly gratifying.

Areas for improvement:
- The comparison to established benchmarks raises more questions than it answers. As the authors note, this comparison is apples-to-oranges.


Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition

Bluche, Theodore

Neural Information Processing Systems
